Using Typography in Document Image Analysis
Abstract
Although font usage plays an important role in Document Image Analysis (DIA), recognition systems generally treat font management in a weaker sense than the document production cycle does. From the point of view of the document recognition community, we show how typographic information (character bitmaps, metrics, etc.) can improve existing analysis methods. After a brief survey of font recognition issues, we present the advantages of font software support in the design of recognition systems. Concrete algorithms are proposed for a posteriori font recognition, monofont Optical Character Recognition (OCR), and word segmentation. The reported experiments and results indicate that there are still substantial benefits to expect from the design of typography-aware analyzers.
Similar resources
Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback
Keyword spotting is a well-known method in document image retrieval. In this method, search in document images is based on a query word image. In this paper, an approach for document image retrieval based on keyword spotting is proposed. In the proposed method, a framework using relevance feedback is presented. Relevance feedback, an interactive and efficient method, is used in this paper to imp...
Document Analysis And Classification Based On Passing Window
In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in preprocessing by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...
Persian Printed Document Analysis and Page Segmentation
This paper presents a hybrid method, combining low-resolution and high-resolution analysis, for Persian page segmentation. In the low-resolution page segmentation, a pyramidal image structure is constructed for multiscale analysis and the document image is segmented into a set of regions. In the high-resolution page segmentation, connected component analysis segments each region into homogeneous regions and identifyi...
Analyzing the Communicative Functions in Typography (the Posters of Asma’ol Hosna in Iran) Using Jakobson’s Approach
The present study addresses typographic methods of communication in posters. The purpose is to investigate the visual elements that create the communicative functions of the typography in Asma’ol Hosna posters, based on Jakobson’s communication theory. The question is: by what visual elements are the communicative functions propounded in the typography of the posters in this study? T...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...